{"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"name":"NLU_training_multi_class_text_classifier_demo_wine.ipynb","provenance":[],"collapsed_sections":["zkufh760uvF3"]},"kernelspec":{"display_name":"Python 3","name":"python3"}},"cells":[{"cell_type":"markdown","metadata":{"id":"zkufh760uvF3"},"source":["![JohnSnowLabs](https://nlp.johnsnowlabs.com/assets/images/logo.png)\n","\n","[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/Training/multi_class_text_classification/NLU_training_multi_class_text_classifier_demo_wine.ipynb)\n","\n","\n","\n","# Training a Deep Learning Classifier with NLU \n","## ClassifierDL (Multi-class Text Classification)\n","## 4-class Wine Enthusiast wine review classifier training\n","With the [ClassifierDL model](https://nlp.johnsnowlabs.com/docs/en/annotators#classifierdl-multi-class-text-classification) from Spark NLP you can achieve state-of-the-art results on multi-class text classification problems.\n","\n","This notebook showcases the following features:\n","\n","- How to train the deep learning classifier\n","- How to store a pipeline to disk\n","- How to load the pipeline from disk (enables NLU offline mode)\n","\n","\n","You can achieve these results, or even better ones, on the training data of this dataset:\n","\n","<br>\n","\n","![image.png]()\n","\n","You can achieve these results, or even better ones, on the test data of this dataset:\n","\n","<br>\n","\n","\n","![image.png]()\n","\n"]},{"cell_type":"markdown","metadata":{"id":"dur2drhW5Rvi"},"source":["# 1. 
Install Java 8 and NLU"]},{"cell_type":"code","metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"hFGnBCHavltY","executionInfo":{"elapsed":32398,"status":"ok","timestamp":1620188669127,"user":{"displayName":"ahmed lone","photoUrl":"","userId":"02458088882398909889"},"user_tz":-300},"outputId":"be856a07-ea44-4bfe-c741-1e6674ae24d6"},"source":["!wget https://setup.johnsnowlabs.com/nlu/colab.sh -O - | bash\n","  \n","\n","import nlu"],"execution_count":null,"outputs":[{"output_type":"stream","text":["--2021-05-05 04:23:57--  https://raw.githubusercontent.com/JohnSnowLabs/nlu/master/scripts/colab_setup.sh\n","Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.108.133, 185.199.109.133, ...\n","Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.\n","HTTP request sent, awaiting response... 200 OK\n","Length: 1671 (1.6K) [text/plain]\n","Saving to: ‘STDOUT’\n","\n","\r-                     0%[                    ]       0  --.-KB/s               Installing  NLU 3.0.0 with  PySpark 3.0.2 and Spark NLP 3.0.1 for Google Colab ...\n","\r-                   100%[===================>]   1.63K  --.-KB/s    in 0.001s  \n","\n","2021-05-05 04:23:57 (2.60 MB/s) - written to stdout [1671/1671]\n","\n"],"name":"stdout"}]},{"cell_type":"markdown","metadata":{"id":"f4KkTfnR5Ugg"},"source":["# 2. Download the wine review dataset \n","https://www.kaggle.com/zynicide/wine-reviews\n","\n","The dataset contains wine reviews, each labeled with one of 4 review classes (for example `good` or `best`)."]},{"cell_type":"code","metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"OrVb5ZMvvrQD","executionInfo":{"elapsed":33566,"status":"ok","timestamp":1620188670320,"user":{"displayName":"ahmed lone","photoUrl":"","userId":"02458088882398909889"},"user_tz":-300},"outputId":"eb56146a-5ff2-4da7-da31-1fa504f3659c"},"source":["! 
wget http://ckl-it.de/wp-content/uploads/2021/01/winemag-data_first150k.csv\n"],"execution_count":null,"outputs":[{"output_type":"stream","text":["--2021-05-05 04:24:28--  http://ckl-it.de/wp-content/uploads/2021/01/winemag-data_first150k.csv\n","Resolving ckl-it.de (ckl-it.de)... 217.160.0.108, 2001:8d8:100f:f000::209\n","Connecting to ckl-it.de (ckl-it.de)|217.160.0.108|:80... connected.\n","HTTP request sent, awaiting response... 200 OK\n","Length: 1447273 (1.4M) [text/csv]\n","Saving to: ‘winemag-data_first150k.csv.2’\n","\n","winemag-data_first1 100%[===================>]   1.38M  1.91MB/s    in 0.7s    \n","\n","2021-05-05 04:24:29 (1.91 MB/s) - ‘winemag-data_first150k.csv.2’ saved [1447273/1447273]\n","\n"],"name":"stdout"}]},{"cell_type":"code","metadata":{"colab":{"base_uri":"https://localhost:8080/","height":419},"id":"y4xSRWIhwT28","executionInfo":{"elapsed":34170,"status":"ok","timestamp":1620188670949,"user":{"displayName":"ahmed lone","photoUrl":"","userId":"02458088882398909889"},"user_tz":-300},"outputId":"cc9da822-9e17-4a08-bd28-a82a42c1bd2d"},"source":["import pandas as pd\n","from sklearn.model_selection import train_test_split\n","\n","# Path of the CSV file downloaded in the previous cell\n","data_path = '/content/winemag-data_first150k.csv'\n","df = pd.read_csv(data_path, sep=\",\")\n","\n","# Keep only the label column 'y' and the text column 'text'\n","cols = [\"y\", \"text\"]\n","df = df[cols]\n","\n","# Hold out 20% of the rows as a test set\n","train_df, test_df = train_test_split(df, test_size=0.2)\n","train_df\n","\n"],"execution_count":null,"outputs":[{"output_type":"execute_result","data":{"text/html":["<div>\n","<style scoped>\n","    .dataframe tbody tr th:only-of-type {\n","        vertical-align: middle;\n","    }\n","\n","    .dataframe tbody tr th {\n","        vertical-align: top;\n","    }\n","\n","    .dataframe thead th {\n","        text-align: right;\n","    }\n","</style>\n","<table border=\"1\" class=\"dataframe\">\n","  <thead>\n","    <tr style=\"text-align: right;\">\n","      <th></th>\n","      <th>y</th>\n","      <th>text</th>\n","    </tr>\n","  </thead>\n","  
<tbody>\n","    <tr>\n","      <th>1458</th>\n","      <td>good</td>\n","      <td>Full of yellow fruits, ripe apples and soft ac...</td>\n","    </tr>\n","    <tr>\n","      <th>378</th>\n","      <td>acceptable</td>\n","      <td>Barnyard aromas atop berry scents make for a n...</td>\n","    </tr>\n","    <tr>\n","      <th>17</th>\n","      <td>very good</td>\n","      <td>An aromatic twist of passion fruit plays on th...</td>\n","    </tr>\n","    <tr>\n","      <th>2456</th>\n","      <td>very good</td>\n","      <td>Wood smoke and black pepper aromas start this ...</td>\n","    </tr>\n","    <tr>\n","      <th>2103</th>\n","      <td>best</td>\n","      <td>Talk about magnetic aromas of bacon, tobacco a...</td>\n","    </tr>\n","    <tr>\n","      <th>...</th>\n","      <td>...</td>\n","      <td>...</td>\n","    </tr>\n","    <tr>\n","      <th>1464</th>\n","      <td>acceptable</td>\n","      <td>The flint soil of the vineyard shows in the st...</td>\n","    </tr>\n","    <tr>\n","      <th>3942</th>\n","      <td>acceptable</td>\n","      <td>Fruit-forward and simple, with the sugared tas...</td>\n","    </tr>\n","    <tr>\n","      <th>1871</th>\n","      <td>very good</td>\n","      <td>Bryan Babcock makes plenty of more expensive P...</td>\n","    </tr>\n","    <tr>\n","      <th>1945</th>\n","      <td>good</td>\n","      <td>Berry and plum aromas are spicy and saucy, wit...</td>\n","    </tr>\n","    <tr>\n","      <th>1219</th>\n","      <td>good</td>\n","      <td>A kitchen-sink blend of nine different varieti...</td>\n","    </tr>\n","  </tbody>\n","</table>\n","<p>4048 rows × 2 columns</p>\n","</div>"],"text/plain":["               y                                               text\n","1458        good  Full of yellow fruits, ripe apples and soft ac...\n","378   acceptable  Barnyard aromas atop berry scents make for a n...\n","17     very good  An aromatic twist of passion fruit plays on th...\n","2456   very good  Wood smoke and black pepper 
aromas start this ...\n","2103        best  Talk about magnetic aromas of bacon, tobacco a...\n","...          ...                                                ...\n","1464  acceptable  The flint soil of the vineyard shows in the st...\n","3942  acceptable  Fruit-forward and simple, with the sugared tas...\n","1871   very good  Bryan Babcock makes plenty of more expensive P...\n","1945        good  Berry and plum aromas are spicy and saucy, wit...\n","1219        good  A kitchen-sink blend of nine different varieti...\n","\n","[4048 rows x 2 columns]"]},"metadata":{"tags":[]},"execution_count":3}]},{"cell_type":"markdown","metadata":{"id":"0296Om2C5anY"},"source":["# 3. Train Deep Learning Classifier using nlu.load('train.classifier')\n","\n","Your dataset's label column should be named 'y' and the feature column with the text data should be named 'text'."]},{"cell_type":"code","metadata":{"colab":{"base_uri":"https://localhost:8080/","height":1000},"id":"3ZIPkRkWftBG","executionInfo":{"elapsed":132178,"status":"ok","timestamp":1620188768985,"user":{"displayName":"ahmed lone","photoUrl":"","userId":"02458088882398909889"},"user_tz":-300},"outputId":"bc0a38f8-d374-48da-98d0-196505e56e66"},"source":["# load a trainable pipeline by specifying the train. 
prefix and fit it on a dataset with label and text columns\n","# Since there are no\n","\n","trainable_pipe = nlu.load('train.classifier')\n","fitted_pipe = trainable_pipe.fit(train_df.iloc[:50])\n","\n","\n","# predict with the fitted pipeline on the dataset and get predictions\n","preds = fitted_pipe.predict(train_df.iloc[:50], output_level='document')\n","preds"],"execution_count":null,"outputs":[{"output_type":"stream","text":["tfhub_use download started this may take some time.\n","Approximate size to download 923.7 MB\n","[OK!]\n","sentence_detector_dl download started this may take some time.\n","Approximate size to download 354.6 KB\n","[OK!]\n"],"name":"stdout"},{"output_type":"execute_result","data":{"text/html":["<div>\n","<style scoped>\n","    .dataframe tbody tr th:only-of-type {\n","        vertical-align: middle;\n","    }\n","\n","    .dataframe tbody tr th {\n","        vertical-align: top;\n","    }\n","\n","    .dataframe thead th {\n","        text-align: right;\n","    }\n","</style>\n","<table border=\"1\" class=\"dataframe\">\n","  <thead>\n","    <tr style=\"text-align: right;\">\n","      <th></th>\n","      <th>sentence_embedding_use</th>\n","      <th>origin_index</th>\n","      <th>trained_classifier_confidence_confidence</th>\n","      <th>document</th>\n","      <th>y</th>\n","      <th>sentence</th>\n","      <th>trained_classifier</th>\n","      <th>text</th>\n","    </tr>\n","  </thead>\n","  <tbody>\n","    <tr>\n","      <th>0</th>\n","      <td>[0.026885146275162697, -0.06771063804626465, 0...</td>\n","      <td>1458</td>\n","      <td>0.620346</td>\n","      <td>Full of yellow fruits, ripe apples and soft ac...</td>\n","      <td>good</td>\n","      <td>[Full of yellow fruits, ripe apples and soft a...</td>\n","      <td>good</td>\n","      <td>Full of yellow fruits, ripe apples and soft ac...</td>\n","    </tr>\n","    <tr>\n","      <th>1</th>\n","      <td>[0.04962018504738808, 0.0652838945388794, -0.0...</td>\n","      
<td>378</td>\n","      <td>0.543338</td>\n","      <td>Barnyard aromas atop berry scents make for a n...</td>\n","      <td>acceptable</td>\n","      <td>[Barnyard aromas atop berry scents make for a ...</td>\n","      <td>good</td>\n","      <td>Barnyard aromas atop berry scents make for a n...</td>\n","    </tr>\n","    <tr>\n","      <th>2</th>\n","      <td>[0.017539022490382195, -0.010785154066979885, ...</td>\n","      <td>17</td>\n","      <td>0.607028</td>\n","      <td>An aromatic twist of passion fruit plays on th...</td>\n","      <td>very good</td>\n","      <td>[An aromatic twist of passion fruit plays on t...</td>\n","      <td>good</td>\n","      <td>An aromatic twist of passion fruit plays on th...</td>\n","    </tr>\n","    <tr>\n","      <th>3</th>\n","      <td>[0.016984855756163597, -0.010578665882349014, ...</td>\n","      <td>2456</td>\n","      <td>0.602528</td>\n","      <td>Wood smoke and black pepper aromas start this ...</td>\n","      <td>very good</td>\n","      <td>[Wood smoke and black pepper aromas start this...</td>\n","      <td>good</td>\n","      <td>Wood smoke and black pepper aromas start this ...</td>\n","    </tr>\n","    <tr>\n","      <th>4</th>\n","      <td>[0.02070983126759529, -0.05402781069278717, -0...</td>\n","      <td>2103</td>\n","      <td>0.591057</td>\n","      <td>Talk about magnetic aromas of bacon, tobacco a...</td>\n","      <td>best</td>\n","      <td>[Talk about magnetic aromas of bacon, tobacco ...</td>\n","      <td>good</td>\n","      <td>Talk about magnetic aromas of bacon, tobacco a...</td>\n","    </tr>\n","    <tr>\n","      <th>5</th>\n","      <td>[-0.008546369150280952, -8.575373067287728e-05...</td>\n","      <td>3448</td>\n","      <td>0.611285</td>\n","      <td>Smoky, oaky, charred flavors of savory plum an...</td>\n","      <td>good</td>\n","      <td>[Smoky, oaky, charred flavors of savory plum a...</td>\n","      <td>good</td>\n","      <td>Smoky, oaky, charred flavors of savory plum 
an...</td>\n","    </tr>\n","    <tr>\n","      <th>6</th>\n","      <td>[0.0020979971159249544, -0.059595026075839996,...</td>\n","      <td>1556</td>\n","      <td>0.600705</td>\n","      <td>Made entirely with Nero d'Avola, this offers a...</td>\n","      <td>good</td>\n","      <td>[Made entirely with Nero d'Avola, this offers ...</td>\n","      <td>good</td>\n","      <td>Made entirely with Nero d'Avola, this offers a...</td>\n","    </tr>\n","    <tr>\n","      <th>7</th>\n","      <td>[0.03075247071683407, -0.05452873930335045, -0...</td>\n","      <td>3453</td>\n","      <td>0.614901</td>\n","      <td>Remarkably strong cinnamon characterizes the n...</td>\n","      <td>good</td>\n","      <td>[Remarkably strong cinnamon characterizes the ...</td>\n","      <td>good</td>\n","      <td>Remarkably strong cinnamon characterizes the n...</td>\n","    </tr>\n","    <tr>\n","      <th>8</th>\n","      <td>[-0.005710848607122898, -0.054149989038705826,...</td>\n","      <td>3729</td>\n","      <td>0.568524</td>\n","      <td>A sensational bottle at this price, it's ripe ...</td>\n","      <td>very good</td>\n","      <td>[A sensational bottle at this price, it's ripe...</td>\n","      <td>good</td>\n","      <td>A sensational bottle at this price, it's ripe ...</td>\n","    </tr>\n","    <tr>\n","      <th>9</th>\n","      <td>[0.028862619772553444, -0.05910215526819229, -...</td>\n","      <td>996</td>\n","      <td>0.605391</td>\n","      <td>Produced from 25-year-old vines, this is a fre...</td>\n","      <td>good</td>\n","      <td>[Produced from 25-year-old vines, this is a fr...</td>\n","      <td>good</td>\n","      <td>Produced from 25-year-old vines, this is a fre...</td>\n","    </tr>\n","    <tr>\n","      <th>10</th>\n","      <td>[0.020464830100536346, -0.06415539979934692, -...</td>\n","      <td>1867</td>\n","      <td>0.566839</td>\n","      <td>Despite its power, this is so elegant, showing...</td>\n","      <td>best</td>\n","      <td>[Despite 
its power, this is so elegant, showin...</td>\n","      <td>good</td>\n","      <td>Despite its power, this is so elegant, showing...</td>\n","    </tr>\n","    <tr>\n","      <th>11</th>\n","      <td>[-0.005847534164786339, -0.040369633585214615,...</td>\n","      <td>1891</td>\n","      <td>0.557577</td>\n","      <td>So incredibly thick and sweet it's almost chew...</td>\n","      <td>best</td>\n","      <td>[So incredibly thick and sweet it's almost che...</td>\n","      <td>good</td>\n","      <td>So incredibly thick and sweet it's almost chew...</td>\n","    </tr>\n","    <tr>\n","      <th>12</th>\n","      <td>[0.005895282607525587, -0.02846178226172924, -...</td>\n","      <td>3542</td>\n","      <td>0.567143</td>\n","      <td>By nature AlbariÃ±o doesn't last long, and thi...</td>\n","      <td>acceptable</td>\n","      <td>[By nature AlbariÃ±o doesn't last long, and th...</td>\n","      <td>good</td>\n","      <td>By nature AlbariÃ±o doesn't last long, and thi...</td>\n","    </tr>\n","    <tr>\n","      <th>13</th>\n","      <td>[0.053103938698768616, -0.031248200684785843, ...</td>\n","      <td>756</td>\n","      <td>0.628370</td>\n","      <td>Made entirely from Cabernet Franc, this opens ...</td>\n","      <td>good</td>\n","      <td>[Made entirely from Cabernet Franc, this opens...</td>\n","      <td>good</td>\n","      <td>Made entirely from Cabernet Franc, this opens ...</td>\n","    </tr>\n","    <tr>\n","      <th>14</th>\n","      <td>[0.04724583402276039, -0.05935594439506531, -0...</td>\n","      <td>1456</td>\n","      <td>0.584150</td>\n","      <td>Peppery, herbal aromas are gritty and a touch ...</td>\n","      <td>acceptable</td>\n","      <td>[Peppery, herbal aromas are gritty and a touch...</td>\n","      <td>good</td>\n","      <td>Peppery, herbal aromas are gritty and a touch ...</td>\n","    </tr>\n","    <tr>\n","      <th>15</th>\n","      <td>[0.036762576550245285, -0.027270542457699776, ...</td>\n","      <td>423</td>\n","     
 <td>0.616689</td>\n","      <td>Wild berries, from elderberry to salmonberry t...</td>\n","      <td>good</td>\n","      <td>[Wild berries, from elderberry to salmonberry ...</td>\n","      <td>good</td>\n","      <td>Wild berries, from elderberry to salmonberry t...</td>\n","    </tr>\n","    <tr>\n","      <th>16</th>\n","      <td>[-0.011491759680211544, -0.08527237921953201, ...</td>\n","      <td>1760</td>\n","      <td>0.551139</td>\n","      <td>A serious, impressive wine produced from a sma...</td>\n","      <td>best</td>\n","      <td>[A serious, impressive wine produced from a sm...</td>\n","      <td>good</td>\n","      <td>A serious, impressive wine produced from a sma...</td>\n","    </tr>\n","    <tr>\n","      <th>17</th>\n","      <td>[-0.005252075847238302, -0.020571669563651085,...</td>\n","      <td>1671</td>\n","      <td>0.575260</td>\n","      <td>Right out of the bottle, this 100% Syrah delig...</td>\n","      <td>very good</td>\n","      <td>[Right out of the bottle, this 100% Syrah deli...</td>\n","      <td>good</td>\n","      <td>Right out of the bottle, this 100% Syrah delig...</td>\n","    </tr>\n","    <tr>\n","      <th>18</th>\n","      <td>[-0.025395192205905914, -0.060151971876621246,...</td>\n","      <td>4998</td>\n","      <td>0.578186</td>\n","      <td>Aromatic and lushly scented with rhubarb, stra...</td>\n","      <td>very good</td>\n","      <td>[Aromatic and lushly scented with rhubarb, str...</td>\n","      <td>good</td>\n","      <td>Aromatic and lushly scented with rhubarb, stra...</td>\n","    </tr>\n","    <tr>\n","      <th>19</th>\n","      <td>[0.026585254818201065, -0.03247511386871338, 0...</td>\n","      <td>2561</td>\n","      <td>0.602712</td>\n","      <td>A clean nose of citrus paves the way for flavo...</td>\n","      <td>acceptable</td>\n","      <td>[A clean nose of citrus paves the way for flav...</td>\n","      <td>good</td>\n","      <td>A clean nose of citrus paves the way for flavo...</td>\n","    
</tr>\n","    <tr>\n","      <th>20</th>\n","      <td>[0.020379165187478065, -0.01716090366244316, -...</td>\n","      <td>3950</td>\n","      <td>0.564268</td>\n","      <td>Ripe blackberry aromas are a touch malty and j...</td>\n","      <td>very good</td>\n","      <td>[Ripe blackberry aromas are a touch malty and ...</td>\n","      <td>good</td>\n","      <td>Ripe blackberry aromas are a touch malty and j...</td>\n","    </tr>\n","    <tr>\n","      <th>21</th>\n","      <td>[-0.016609780490398407, -0.036731649190187454,...</td>\n","      <td>2699</td>\n","      <td>0.604877</td>\n","      <td>Balanced toward the bold side, it's full-bodie...</td>\n","      <td>good</td>\n","      <td>[Balanced toward the bold side, it's full-bodi...</td>\n","      <td>good</td>\n","      <td>Balanced toward the bold side, it's full-bodie...</td>\n","    </tr>\n","    <tr>\n","      <th>22</th>\n","      <td>[-0.015338313765823841, -0.044104497879743576,...</td>\n","      <td>3078</td>\n","      <td>0.581559</td>\n","      <td>Lifted notes of dried pear, dried chamomile fl...</td>\n","      <td>very good</td>\n","      <td>[Lifted notes of dried pear, dried chamomile f...</td>\n","      <td>good</td>\n","      <td>Lifted notes of dried pear, dried chamomile fl...</td>\n","    </tr>\n","    <tr>\n","      <th>23</th>\n","      <td>[-0.03071591444313526, -0.038018349558115005, ...</td>\n","      <td>3333</td>\n","      <td>0.570483</td>\n","      <td>With its sophisticated mix of mineral, acid an...</td>\n","      <td>very good</td>\n","      <td>[With its sophisticated mix of mineral, acid a...</td>\n","      <td>good</td>\n","      <td>With its sophisticated mix of mineral, acid an...</td>\n","    </tr>\n","    <tr>\n","      <th>24</th>\n","      <td>[0.06536413729190826, -0.04731231555342674, -0...</td>\n","      <td>550</td>\n","      <td>0.503100</td>\n","      <td>This is really a spectacular wine. 
It's hard t...</td>\n","      <td>best</td>\n","      <td>[This is really a spectacular wine., It's hard...</td>\n","      <td>good</td>\n","      <td>This is really a spectacular wine. It's hard t...</td>\n","    </tr>\n","    <tr>\n","      <th>25</th>\n","      <td>[0.03561289235949516, -0.03295084834098816, -0...</td>\n","      <td>3615</td>\n","      <td>0.576032</td>\n","      <td>Bruce McGuire makes a strong case for the pote...</td>\n","      <td>very good</td>\n","      <td>[Bruce McGuire makes a strong case for the pot...</td>\n","      <td>good</td>\n","      <td>Bruce McGuire makes a strong case for the pote...</td>\n","    </tr>\n","    <tr>\n","      <th>26</th>\n","      <td>[0.017156174406409264, -0.06948413699865341, -...</td>\n","      <td>1422</td>\n","      <td>0.570740</td>\n","      <td>This firm, tannic wine is made from fruit sour...</td>\n","      <td>best</td>\n","      <td>[This firm, tannic wine is made from fruit sou...</td>\n","      <td>good</td>\n","      <td>This firm, tannic wine is made from fruit sour...</td>\n","    </tr>\n","    <tr>\n","      <th>27</th>\n","      <td>[0.014733278192579746, 0.006120753940194845, -...</td>\n","      <td>2824</td>\n","      <td>0.553367</td>\n","      <td>It's odd to find reductive elements on a white...</td>\n","      <td>acceptable</td>\n","      <td>[It's odd to find reductive elements on a whit...</td>\n","      <td>good</td>\n","      <td>It's odd to find reductive elements on a white...</td>\n","    </tr>\n","    <tr>\n","      <th>28</th>\n","      <td>[-0.02304786629974842, -0.058774229139089584, ...</td>\n","      <td>4935</td>\n","      <td>0.537677</td>\n","      <td>With the 2009 vintage, De Loach is at the top ...</td>\n","      <td>best</td>\n","      <td>[With the 2009 vintage, De Loach is at the top...</td>\n","      <td>good</td>\n","      <td>With the 2009 vintage, De Loach is at the top ...</td>\n","    </tr>\n","    <tr>\n","      <th>29</th>\n","      
<td>[0.06899003684520721, -0.043633416295051575, -...</td>\n","      <td>2365</td>\n","      <td>0.603090</td>\n","      <td>This nose is rather closed but the firm palate...</td>\n","      <td>good</td>\n","      <td>[This nose is rather closed but the firm palat...</td>\n","      <td>good</td>\n","      <td>This nose is rather closed but the firm palate...</td>\n","    </tr>\n","    <tr>\n","      <th>30</th>\n","      <td>[-0.035565443336963654, -0.04602222889661789, ...</td>\n","      <td>2754</td>\n","      <td>0.619796</td>\n","      <td>Enticing wildflower, chopped herb and ripe orc...</td>\n","      <td>very good</td>\n","      <td>[Enticing wildflower, chopped herb and ripe or...</td>\n","      <td>good</td>\n","      <td>Enticing wildflower, chopped herb and ripe orc...</td>\n","    </tr>\n","    <tr>\n","      <th>31</th>\n","      <td>[-0.021357573568820953, -0.07660140097141266, ...</td>\n","      <td>955</td>\n","      <td>0.600876</td>\n","      <td>Fresh and juicy, this full-bodied while struct...</td>\n","      <td>good</td>\n","      <td>[Fresh and juicy, this full-bodied while struc...</td>\n","      <td>good</td>\n","      <td>Fresh and juicy, this full-bodied while struct...</td>\n","    </tr>\n","    <tr>\n","      <th>32</th>\n","      <td>[-0.008119497448205948, -0.022551201283931732,...</td>\n","      <td>1632</td>\n","      <td>0.606923</td>\n","      <td>The best of the winery's 2011 block reserves, ...</td>\n","      <td>good</td>\n","      <td>[The best of the winery's 2011 block reserves,...</td>\n","      <td>good</td>\n","      <td>The best of the winery's 2011 block reserves, ...</td>\n","    </tr>\n","    <tr>\n","      <th>33</th>\n","      <td>[-0.010882342234253883, -0.06637204438447952, ...</td>\n","      <td>3025</td>\n","      <td>0.584699</td>\n","      <td>A blend of fruit from three Rogue Valley sites...</td>\n","      <td>good</td>\n","      <td>[A blend of fruit from three Rogue Valley site...</td>\n","      
<td>good</td>\n","      <td>A blend of fruit from three Rogue Valley sites...</td>\n","    </tr>\n","    <tr>\n","      <th>34</th>\n","      <td>[0.016116805374622345, -0.06279352307319641, -...</td>\n","      <td>3763</td>\n","      <td>0.549124</td>\n","      <td>Exceeds even this producer's stunning beerenau...</td>\n","      <td>best</td>\n","      <td>[Exceeds even this producer's stunning beerena...</td>\n","      <td>good</td>\n","      <td>Exceeds even this producer's stunning beerenau...</td>\n","    </tr>\n","    <tr>\n","      <th>35</th>\n","      <td>[0.0413261242210865, -0.027950072661042213, -0...</td>\n","      <td>695</td>\n","      <td>0.532788</td>\n","      <td>The palate opens slowly, offering an initial c...</td>\n","      <td>best</td>\n","      <td>[The palate opens slowly, offering an initial ...</td>\n","      <td>good</td>\n","      <td>The palate opens slowly, offering an initial c...</td>\n","    </tr>\n","    <tr>\n","      <th>36</th>\n","      <td>[-0.04491781070828438, -0.016601886600255966, ...</td>\n","      <td>1196</td>\n","      <td>0.597868</td>\n","      <td>This is pretty pale for a Tavel, with a copper...</td>\n","      <td>good</td>\n","      <td>[This is pretty pale for a Tavel, with a coppe...</td>\n","      <td>good</td>\n","      <td>This is pretty pale for a Tavel, with a copper...</td>\n","    </tr>\n","    <tr>\n","      <th>37</th>\n","      <td>[0.04148593544960022, -0.005966056603938341, -...</td>\n","      <td>4338</td>\n","      <td>0.559302</td>\n","      <td>From the producer's Yountville vineyard, as we...</td>\n","      <td>acceptable</td>\n","      <td>[From the producer's Yountville vineyard, as w...</td>\n","      <td>good</td>\n","      <td>From the producer's Yountville vineyard, as we...</td>\n","    </tr>\n","    <tr>\n","      <th>38</th>\n","      <td>[-0.044102396816015244, -0.021978359669446945,...</td>\n","      <td>3514</td>\n","      <td>0.587864</td>\n","      <td>Fruit from the oldest 
blocks in this pioneerin...</td>\n","      <td>very good</td>\n","      <td>[Fruit from the oldest blocks in this pioneeri...</td>\n","      <td>good</td>\n","      <td>Fruit from the oldest blocks in this pioneerin...</td>\n","    </tr>\n","    <tr>\n","      <th>39</th>\n","      <td>[-0.01988844946026802, -0.07353201508522034, -...</td>\n","      <td>2551</td>\n","      <td>0.579697</td>\n","      <td>More people would crave MourvÃ¨dre if it were ...</td>\n","      <td>very good</td>\n","      <td>[More people would crave MourvÃ¨dre if it were...</td>\n","      <td>good</td>\n","      <td>More people would crave MourvÃ¨dre if it were ...</td>\n","    </tr>\n","    <tr>\n","      <th>40</th>\n","      <td>[0.03592630848288536, -0.0763244703412056, -0....</td>\n","      <td>4793</td>\n","      <td>0.586441</td>\n","      <td>Seduction may be an odd word to use for a Malb...</td>\n","      <td>good</td>\n","      <td>[Seduction may be an odd word to use for a Mal...</td>\n","      <td>good</td>\n","      <td>Seduction may be an odd word to use for a Malb...</td>\n","    </tr>\n","    <tr>\n","      <th>41</th>\n","      <td>[-0.027629075571894646, -0.02934451773762703, ...</td>\n","      <td>2996</td>\n","      <td>0.625336</td>\n","      <td>Oceanic aromas of grass, scallion, baby garlic...</td>\n","      <td>good</td>\n","      <td>[Oceanic aromas of grass, scallion, baby garli...</td>\n","      <td>good</td>\n","      <td>Oceanic aromas of grass, scallion, baby garlic...</td>\n","    </tr>\n","    <tr>\n","      <th>42</th>\n","      <td>[0.017115404829382896, -0.032828159630298615, ...</td>\n","      <td>706</td>\n","      <td>0.581963</td>\n","      <td>As expected, the wine exhibits a dense black c...</td>\n","      <td>best</td>\n","      <td>[As expected, the wine exhibits a dense black ...</td>\n","      <td>good</td>\n","      <td>As expected, the wine exhibits a dense black c...</td>\n","    </tr>\n","    <tr>\n","      <th>43</th>\n","      
<td>[0.030317923054099083, -0.07067029923200607, 0...</td>\n","      <td>4033</td>\n","      <td>0.606539</td>\n","      <td>A mix of ripe black fruits and dense tannins c...</td>\n","      <td>good</td>\n","      <td>[A mix of ripe black fruits and dense tannins ...</td>\n","      <td>good</td>\n","      <td>A mix of ripe black fruits and dense tannins c...</td>\n","    </tr>\n","    <tr>\n","      <th>44</th>\n","      <td>[-0.018614105880260468, -0.05517773702740669, ...</td>\n","      <td>1542</td>\n","      <td>0.529212</td>\n","      <td>This lightly wood-aged wine, from an estate th...</td>\n","      <td>very good</td>\n","      <td>[This lightly wood-aged wine, from an estate t...</td>\n","      <td>good</td>\n","      <td>This lightly wood-aged wine, from an estate th...</td>\n","    </tr>\n","    <tr>\n","      <th>45</th>\n","      <td>[0.01225570309907198, -0.03293222188949585, -0...</td>\n","      <td>2927</td>\n","      <td>0.590190</td>\n","      <td>With tannins and new-wood flavors, this rich w...</td>\n","      <td>good</td>\n","      <td>[With tannins and new-wood flavors, this rich ...</td>\n","      <td>good</td>\n","      <td>With tannins and new-wood flavors, this rich w...</td>\n","    </tr>\n","    <tr>\n","      <th>46</th>\n","      <td>[0.023177076131105423, -0.040385812520980835, ...</td>\n","      <td>3202</td>\n","      <td>0.592817</td>\n","      <td>Plummy chocolate stars in this densely texture...</td>\n","      <td>good</td>\n","      <td>[Plummy chocolate stars in this densely textur...</td>\n","      <td>good</td>\n","      <td>Plummy chocolate stars in this densely texture...</td>\n","    </tr>\n","    <tr>\n","      <th>47</th>\n","      <td>[-0.0009934792760759592, -0.0660802349448204, ...</td>\n","      <td>3176</td>\n","      <td>0.615645</td>\n","      <td>Cidery aromas vie with bready notes to give th...</td>\n","      <td>good</td>\n","      <td>[Cidery aromas vie with bready notes to give t...</td>\n","      
<td>good</td>\n","      <td>Cidery aromas vie with bready notes to give th...</td>\n","    </tr>\n","    <tr>\n","      <th>48</th>\n","      <td>[-0.055515553802251816, 0.007777589838951826, ...</td>\n","      <td>646</td>\n","      <td>0.594357</td>\n","      <td>Salted apples and white rocks are marred by me...</td>\n","      <td>acceptable</td>\n","      <td>[Salted apples and white rocks are marred by m...</td>\n","      <td>good</td>\n","      <td>Salted apples and white rocks are marred by me...</td>\n","    </tr>\n","    <tr>\n","      <th>49</th>\n","      <td>[0.03407430648803711, -0.06791502982378006, -0...</td>\n","      <td>3992</td>\n","      <td>0.560114</td>\n","      <td>The minty aromas indicate new wood aging as we...</td>\n","      <td>very good</td>\n","      <td>[The minty aromas indicate new wood aging as w...</td>\n","      <td>good</td>\n","      <td>The minty aromas indicate new wood aging as we...</td>\n","    </tr>\n","  </tbody>\n","</table>\n","</div>"],"text/plain":["                               sentence_embedding_use  ...                                               text\n","0   [0.026885146275162697, -0.06771063804626465, 0...  ...  Full of yellow fruits, ripe apples and soft ac...\n","1   [0.04962018504738808, 0.0652838945388794, -0.0...  ...  Barnyard aromas atop berry scents make for a n...\n","2   [0.017539022490382195, -0.010785154066979885, ...  ...  An aromatic twist of passion fruit plays on th...\n","3   [0.016984855756163597, -0.010578665882349014, ...  ...  Wood smoke and black pepper aromas start this ...\n","4   [0.02070983126759529, -0.05402781069278717, -0...  ...  Talk about magnetic aromas of bacon, tobacco a...\n","5   [-0.008546369150280952, -8.575373067287728e-05...  ...  Smoky, oaky, charred flavors of savory plum an...\n","6   [0.0020979971159249544, -0.059595026075839996,...  ...  Made entirely with Nero d'Avola, this offers a...\n","7   [0.03075247071683407, -0.05452873930335045, -0...  ...  
Remarkably strong cinnamon characterizes the n...\n","8   [-0.005710848607122898, -0.054149989038705826,...  ...  A sensational bottle at this price, it's ripe ...\n","9   [0.028862619772553444, -0.05910215526819229, -...  ...  Produced from 25-year-old vines, this is a fre...\n","10  [0.020464830100536346, -0.06415539979934692, -...  ...  Despite its power, this is so elegant, showing...\n","11  [-0.005847534164786339, -0.040369633585214615,...  ...  So incredibly thick and sweet it's almost chew...\n","12  [0.005895282607525587, -0.02846178226172924, -...  ...  By nature AlbariÃ±o doesn't last long, and thi...\n","13  [0.053103938698768616, -0.031248200684785843, ...  ...  Made entirely from Cabernet Franc, this opens ...\n","14  [0.04724583402276039, -0.05935594439506531, -0...  ...  Peppery, herbal aromas are gritty and a touch ...\n","15  [0.036762576550245285, -0.027270542457699776, ...  ...  Wild berries, from elderberry to salmonberry t...\n","16  [-0.011491759680211544, -0.08527237921953201, ...  ...  A serious, impressive wine produced from a sma...\n","17  [-0.005252075847238302, -0.020571669563651085,...  ...  Right out of the bottle, this 100% Syrah delig...\n","18  [-0.025395192205905914, -0.060151971876621246,...  ...  Aromatic and lushly scented with rhubarb, stra...\n","19  [0.026585254818201065, -0.03247511386871338, 0...  ...  A clean nose of citrus paves the way for flavo...\n","20  [0.020379165187478065, -0.01716090366244316, -...  ...  Ripe blackberry aromas are a touch malty and j...\n","21  [-0.016609780490398407, -0.036731649190187454,...  ...  Balanced toward the bold side, it's full-bodie...\n","22  [-0.015338313765823841, -0.044104497879743576,...  ...  Lifted notes of dried pear, dried chamomile fl...\n","23  [-0.03071591444313526, -0.038018349558115005, ...  ...  With its sophisticated mix of mineral, acid an...\n","24  [0.06536413729190826, -0.04731231555342674, -0...  ...  This is really a spectacular wine. 
It's hard t...\n","25  [0.03561289235949516, -0.03295084834098816, -0...  ...  Bruce McGuire makes a strong case for the pote...\n","26  [0.017156174406409264, -0.06948413699865341, -...  ...  This firm, tannic wine is made from fruit sour...\n","27  [0.014733278192579746, 0.006120753940194845, -...  ...  It's odd to find reductive elements on a white...\n","28  [-0.02304786629974842, -0.058774229139089584, ...  ...  With the 2009 vintage, De Loach is at the top ...\n","29  [0.06899003684520721, -0.043633416295051575, -...  ...  This nose is rather closed but the firm palate...\n","30  [-0.035565443336963654, -0.04602222889661789, ...  ...  Enticing wildflower, chopped herb and ripe orc...\n","31  [-0.021357573568820953, -0.07660140097141266, ...  ...  Fresh and juicy, this full-bodied while struct...\n","32  [-0.008119497448205948, -0.022551201283931732,...  ...  The best of the winery's 2011 block reserves, ...\n","33  [-0.010882342234253883, -0.06637204438447952, ...  ...  A blend of fruit from three Rogue Valley sites...\n","34  [0.016116805374622345, -0.06279352307319641, -...  ...  Exceeds even this producer's stunning beerenau...\n","35  [0.0413261242210865, -0.027950072661042213, -0...  ...  The palate opens slowly, offering an initial c...\n","36  [-0.04491781070828438, -0.016601886600255966, ...  ...  This is pretty pale for a Tavel, with a copper...\n","37  [0.04148593544960022, -0.005966056603938341, -...  ...  From the producer's Yountville vineyard, as we...\n","38  [-0.044102396816015244, -0.021978359669446945,...  ...  Fruit from the oldest blocks in this pioneerin...\n","39  [-0.01988844946026802, -0.07353201508522034, -...  ...  More people would crave MourvÃ¨dre if it were ...\n","40  [0.03592630848288536, -0.0763244703412056, -0....  ...  Seduction may be an odd word to use for a Malb...\n","41  [-0.027629075571894646, -0.02934451773762703, ...  ...  
Oceanic aromas of grass, scallion, baby garlic...\n","42  [0.017115404829382896, -0.032828159630298615, ...  ...  As expected, the wine exhibits a dense black c...\n","43  [0.030317923054099083, -0.07067029923200607, 0...  ...  A mix of ripe black fruits and dense tannins c...\n","44  [-0.018614105880260468, -0.05517773702740669, ...  ...  This lightly wood-aged wine, from an estate th...\n","45  [0.01225570309907198, -0.03293222188949585, -0...  ...  With tannins and new-wood flavors, this rich w...\n","46  [0.023177076131105423, -0.040385812520980835, ...  ...  Plummy chocolate stars in this densely texture...\n","47  [-0.0009934792760759592, -0.0660802349448204, ...  ...  Cidery aromas vie with bready notes to give th...\n","48  [-0.055515553802251816, 0.007777589838951826, ...  ...  Salted apples and white rocks are marred by me...\n","49  [0.03407430648803711, -0.06791502982378006, -0...  ...  The minty aromas indicate new wood aging as we...\n","\n","[50 rows x 8 columns]"]},"metadata":{"tags":[]},"execution_count":4}]},{"cell_type":"markdown","metadata":{"id":"lVyOE2wV0fw_"},"source":["# 4. 
Test the fitted pipe on new example"]},{"cell_type":"code","metadata":{"colab":{"base_uri":"https://localhost:8080/","height":80},"id":"qdCUg2MR0PD2","executionInfo":{"elapsed":132735,"status":"ok","timestamp":1620188769564,"user":{"displayName":"ahmed lone","photoUrl":"","userId":"02458088882398909889"},"user_tz":-300},"outputId":"c6e2d61a-5884-4739-cd8d-e47f51adee25"},"source":["fitted_pipe.predict('It was one of the best wines i ever tasted .')"],"execution_count":null,"outputs":[{"output_type":"execute_result","data":{"text/html":["<div>\n","<style scoped>\n","    .dataframe tbody tr th:only-of-type {\n","        vertical-align: middle;\n","    }\n","\n","    .dataframe tbody tr th {\n","        vertical-align: top;\n","    }\n","\n","    .dataframe thead th {\n","        text-align: right;\n","    }\n","</style>\n","<table border=\"1\" class=\"dataframe\">\n","  <thead>\n","    <tr style=\"text-align: right;\">\n","      <th></th>\n","      <th>sentence_embedding_use</th>\n","      <th>origin_index</th>\n","      <th>trained_classifier_confidence_confidence</th>\n","      <th>document</th>\n","      <th>sentence</th>\n","      <th>trained_classifier</th>\n","    </tr>\n","  </thead>\n","  <tbody>\n","    <tr>\n","      <th>0</th>\n","      <td>[0.0249565988779068, 0.02628515101969242, -0.0...</td>\n","      <td>0</td>\n","      <td>0.529498</td>\n","      <td>It was one of the best wines i ever tasted .</td>\n","      <td>[It was one of the best wines i ever tasted .]</td>\n","      <td>good</td>\n","    </tr>\n","  </tbody>\n","</table>\n","</div>"],"text/plain":["                              sentence_embedding_use  ...  trained_classifier\n","0  [0.0249565988779068, 0.02628515101969242, -0.0...  ...                good\n","\n","[1 rows x 6 columns]"]},"metadata":{"tags":[]},"execution_count":5}]},{"cell_type":"markdown","metadata":{"id":"xflpwrVjjBVD"},"source":["## 5. 
Configure pipe training parameters"]},{"cell_type":"code","metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"UtsAUGTmOTms","executionInfo":{"elapsed":133127,"status":"ok","timestamp":1620188769982,"user":{"displayName":"ahmed lone","photoUrl":"","userId":"02458088882398909889"},"user_tz":-300},"outputId":"b6e8a835-2cec-4ed9-d3f5-3ed1707e861f"},"source":["trainable_pipe.print_info()"],"execution_count":null,"outputs":[{"output_type":"stream","text":["The following parameters are configurable for this NLU pipeline (You can copy paste the examples) :\n",">>> pipe['classifier_dl'] has settable params:\n","pipe['classifier_dl'].setMaxEpochs(3)                | Info: Maximum number of epochs to train | Currently set to : 3\n","pipe['classifier_dl'].setLr(0.005)                   | Info: Learning Rate | Currently set to : 0.005\n","pipe['classifier_dl'].setBatchSize(64)               | Info: Batch size | Currently set to : 64\n","pipe['classifier_dl'].setDropout(0.5)                | Info: Dropout coefficient | Currently set to : 0.5\n","pipe['classifier_dl'].setEnableOutputLogs(True)      | Info: Whether to use stdout in addition to Spark logs. | Currently set to : True\n",">>> pipe['use@tfhub_use'] has settable params:\n","pipe['use@tfhub_use'].setDimension(512)              | Info: Number of embedding dimensions | Currently set to : 512\n","pipe['use@tfhub_use'].setLoadSP(False)               | Info: Whether to load SentencePiece ops file which is required only by multi-lingual models. This is not changeable after it's set with a pretrained model nor it is compatible with Windows. 
| Currently set to : False\n","pipe['use@tfhub_use'].setStorageRef('tfhub_use')     | Info: unique reference name for identification | Currently set to : tfhub_use\n",">>> pipe['deep_sentence_detector@SentenceDetectorDLModel_c83c27f46b97'] has settable params:\n","pipe['deep_sentence_detector@SentenceDetectorDLModel_c83c27f46b97'].setExplodeSentences(False)  | Info: whether to explode each sentence into a different row, for better parallelization. Defaults to false. | Currently set to : False\n","pipe['deep_sentence_detector@SentenceDetectorDLModel_c83c27f46b97'].setStorageRef('SentenceDetectorDLModel_c83c27f46b97')  | Info: storage unique identifier | Currently set to : SentenceDetectorDLModel_c83c27f46b97\n","pipe['deep_sentence_detector@SentenceDetectorDLModel_c83c27f46b97'].setEncoder(com.johnsnowlabs.nlp.annotators.sentence_detector_dl.SentenceDetectorDLEncoder@587bdb2f)  | Info: Data encoder | Currently set to : com.johnsnowlabs.nlp.annotators.sentence_detector_dl.SentenceDetectorDLEncoder@587bdb2f\n","pipe['deep_sentence_detector@SentenceDetectorDLModel_c83c27f46b97'].setImpossiblePenultimates(['Bros', 'No', 'al', 'vs', 'etc', 'Fig', 'Dr', 'Prof', 'PhD', 'MD', 'Co', 'Corp', 'Inc', 'bros', 'VS', 'Vs', 'ETC', 'fig', 'dr', 'prof', 'PHD', 'phd', 'md', 'co', 'corp', 'inc', 'Jan', 'Feb', 'Mar', 'Apr', 'Jul', 'Aug', 'Sep', 'Sept', 'Oct', 'Nov', 'Dec', 'St', 'st', 'AM', 'PM', 'am', 'pm', 'e.g', 'f.e', 'i.e'])  | Info: Impossible penultimates | Currently set to : ['Bros', 'No', 'al', 'vs', 'etc', 'Fig', 'Dr', 'Prof', 'PhD', 'MD', 'Co', 'Corp', 'Inc', 'bros', 'VS', 'Vs', 'ETC', 'fig', 'dr', 'prof', 'PHD', 'phd', 'md', 'co', 'corp', 'inc', 'Jan', 'Feb', 'Mar', 'Apr', 'Jul', 'Aug', 'Sep', 'Sept', 'Oct', 'Nov', 'Dec', 'St', 'st', 'AM', 'PM', 'am', 'pm', 'e.g', 'f.e', 'i.e']\n","pipe['deep_sentence_detector@SentenceDetectorDLModel_c83c27f46b97'].setModelArchitecture('cnn')  | Info: Model architecture (CNN) | Currently set to : cnn\n",">>> pipe['document_assembler'] has 
settable params:\n","pipe['document_assembler'].setCleanupMode('shrink')  | Info: possible values: disabled, inplace, inplace_full, shrink, shrink_full, each, each_full, delete_full | Currently set to : shrink\n"],"name":"stdout"}]},{"cell_type":"markdown","metadata":{"id":"2GJdDNV9jEIe"},"source":["## 6. Retrain with new parameters"]},{"cell_type":"code","metadata":{"colab":{"base_uri":"https://localhost:8080/","height":793},"id":"mptfvHx-MMMX","executionInfo":{"elapsed":146378,"status":"ok","timestamp":1620188783252,"user":{"displayName":"ahmed lone","photoUrl":"","userId":"02458088882398909889"},"user_tz":-300},"outputId":"94fb762e-f58f-47a5-f016-50d3f21e3c3a"},"source":["# Train longer!\n","trainable_pipe = nlu.load('train.classifier')\n","trainable_pipe['trainable_classifier_dl'].setMaxEpochs(5)  \n","fitted_pipe = trainable_pipe.fit(train_df.iloc[:100])\n","# Predict with the fitted pipeline on the same 100 training rows and get predictions\n","preds = fitted_pipe.predict(train_df.iloc[:100],output_level='document')\n","\n","# The sentence detector that is part of the pipe generates some NaNs. 
Let's drop them first\n","preds.dropna(inplace=True)\n","from sklearn.metrics import classification_report\n","print(classification_report(preds['y'], preds['classifier_dl']))\n","preds"],"execution_count":null,"outputs":[{"output_type":"stream","text":["              precision    recall  f1-score   support\n","\n","  acceptable       0.00      0.00      0.00        20\n","        best       0.00      0.00      0.00        20\n","        good       0.00      0.00      0.00        31\n","   very good       0.29      1.00      0.45        29\n","\n","    accuracy                           0.29       100\n","   macro avg       0.07      0.25      0.11       100\n","weighted avg       0.08      0.29      0.13       100\n","\n"],"name":"stdout"},{"output_type":"execute_result","data":{"text/html":["<div>\n","<style scoped>\n","    .dataframe tbody tr th:only-of-type {\n","        vertical-align: middle;\n","    }\n","\n","    .dataframe tbody tr th {\n","        vertical-align: top;\n","    }\n","\n","    .dataframe thead th {\n","        text-align: right;\n","    }\n","</style>\n","<table border=\"1\" class=\"dataframe\">\n","  <thead>\n","    <tr style=\"text-align: right;\">\n","      <th></th>\n","      <th>sentence_embedding_use</th>\n","      <th>origin_index</th>\n","      <th>trained_classifier_confidence_confidence</th>\n","      <th>document</th>\n","      <th>y</th>\n","      <th>sentence</th>\n","      <th>trained_classifier</th>\n","      <th>text</th>\n","    </tr>\n","  </thead>\n","  <tbody>\n","    <tr>\n","      <th>0</th>\n","      <td>[0.026885146275162697, -0.06771063804626465, 0...</td>\n","      <td>1458</td>\n","      <td>0.393935</td>\n","      <td>Full of yellow fruits, ripe apples and soft ac...</td>\n","      <td>good</td>\n","      <td>[Full of yellow fruits, ripe apples and soft a...</td>\n","      <td>very good</td>\n","      <td>Full of yellow fruits, ripe apples and soft ac...</td>\n","    </tr>\n","    <tr>\n","      <th>1</th>\n","     
 <td>[0.04962018504738808, 0.0652838945388794, -0.0...</td>\n","      <td>378</td>\n","      <td>0.395312</td>\n","      <td>Barnyard aromas atop berry scents make for a n...</td>\n","      <td>acceptable</td>\n","      <td>[Barnyard aromas atop berry scents make for a ...</td>\n","      <td>very good</td>\n","      <td>Barnyard aromas atop berry scents make for a n...</td>\n","    </tr>\n","    <tr>\n","      <th>2</th>\n","      <td>[0.017539022490382195, -0.010785154066979885, ...</td>\n","      <td>17</td>\n","      <td>0.567956</td>\n","      <td>An aromatic twist of passion fruit plays on th...</td>\n","      <td>very good</td>\n","      <td>[An aromatic twist of passion fruit plays on t...</td>\n","      <td>very good</td>\n","      <td>An aromatic twist of passion fruit plays on th...</td>\n","    </tr>\n","    <tr>\n","      <th>3</th>\n","      <td>[0.016984855756163597, -0.010578665882349014, ...</td>\n","      <td>2456</td>\n","      <td>0.577995</td>\n","      <td>Wood smoke and black pepper aromas start this ...</td>\n","      <td>very good</td>\n","      <td>[Wood smoke and black pepper aromas start this...</td>\n","      <td>very good</td>\n","      <td>Wood smoke and black pepper aromas start this ...</td>\n","    </tr>\n","    <tr>\n","      <th>4</th>\n","      <td>[0.02070983126759529, -0.05402781069278717, -0...</td>\n","      <td>2103</td>\n","      <td>0.526734</td>\n","      <td>Talk about magnetic aromas of bacon, tobacco a...</td>\n","      <td>best</td>\n","      <td>[Talk about magnetic aromas of bacon, tobacco ...</td>\n","      <td>very good</td>\n","      <td>Talk about magnetic aromas of bacon, tobacco a...</td>\n","    </tr>\n","    <tr>\n","      <th>...</th>\n","      <td>...</td>\n","      <td>...</td>\n","      <td>...</td>\n","      <td>...</td>\n","      <td>...</td>\n","      <td>...</td>\n","      <td>...</td>\n","      <td>...</td>\n","    </tr>\n","    <tr>\n","      <th>95</th>\n","      <td>[0.05499161034822464, 
-0.04508962109684944, -0...</td>\n","      <td>392</td>\n","      <td>0.437371</td>\n","      <td>This lightly aromatic wine offers notes of her...</td>\n","      <td>acceptable</td>\n","      <td>[This lightly aromatic wine offers notes of he...</td>\n","      <td>very good</td>\n","      <td>This lightly aromatic wine offers notes of her...</td>\n","    </tr>\n","    <tr>\n","      <th>96</th>\n","      <td>[0.03258365020155907, -0.036026667803525925, -...</td>\n","      <td>1376</td>\n","      <td>0.384945</td>\n","      <td>Brilliant aromatics here, just stupendously at...</td>\n","      <td>best</td>\n","      <td>[Brilliant aromatics here, just stupendously a...</td>\n","      <td>very good</td>\n","      <td>Brilliant aromatics here, just stupendously at...</td>\n","    </tr>\n","    <tr>\n","      <th>97</th>\n","      <td>[0.004543728660792112, -0.06775235384702682, -...</td>\n","      <td>3182</td>\n","      <td>0.405291</td>\n","      <td>This sparkling wine is intense with startling ...</td>\n","      <td>acceptable</td>\n","      <td>[This sparkling wine is intense with startling...</td>\n","      <td>very good</td>\n","      <td>This sparkling wine is intense with startling ...</td>\n","    </tr>\n","    <tr>\n","      <th>98</th>\n","      <td>[0.0255692508071661, 0.04986872524023056, -0.0...</td>\n","      <td>3709</td>\n","      <td>0.456385</td>\n","      <td>The nose is like a veil of Golden Delicious ap...</td>\n","      <td>very good</td>\n","      <td>[The nose is like a veil of Golden Delicious a...</td>\n","      <td>very good</td>\n","      <td>The nose is like a veil of Golden Delicious ap...</td>\n","    </tr>\n","    <tr>\n","      <th>99</th>\n","      <td>[-0.0047778417356312275, -0.05661110579967499,...</td>\n","      <td>4389</td>\n","      <td>0.399565</td>\n","      <td>This wine is smooth and ripe, with soft tannin...</td>\n","      <td>good</td>\n","      <td>[This wine is smooth and ripe, with soft tanni...</td>\n","      
<td>very good</td>\n","      <td>This wine is smooth and ripe, with soft tannin...</td>\n","    </tr>\n","  </tbody>\n","</table>\n","<p>100 rows × 8 columns</p>\n","</div>"],"text/plain":["                               sentence_embedding_use  ...                                               text\n","0   [0.026885146275162697, -0.06771063804626465, 0...  ...  Full of yellow fruits, ripe apples and soft ac...\n","1   [0.04962018504738808, 0.0652838945388794, -0.0...  ...  Barnyard aromas atop berry scents make for a n...\n","2   [0.017539022490382195, -0.010785154066979885, ...  ...  An aromatic twist of passion fruit plays on th...\n","3   [0.016984855756163597, -0.010578665882349014, ...  ...  Wood smoke and black pepper aromas start this ...\n","4   [0.02070983126759529, -0.05402781069278717, -0...  ...  Talk about magnetic aromas of bacon, tobacco a...\n","..                                                ...  ...                                                ...\n","95  [0.05499161034822464, -0.04508962109684944, -0...  ...  This lightly aromatic wine offers notes of her...\n","96  [0.03258365020155907, -0.036026667803525925, -...  ...  Brilliant aromatics here, just stupendously at...\n","97  [0.004543728660792112, -0.06775235384702682, -...  ...  This sparkling wine is intense with startling ...\n","98  [0.0255692508071661, 0.04986872524023056, -0.0...  ...  The nose is like a veil of Golden Delicious ap...\n","99  [-0.0047778417356312275, -0.05661110579967499,...  ...  This wine is smooth and ripe, with soft tannin...\n","\n","[100 rows x 8 columns]"]},"metadata":{"tags":[]},"execution_count":7}]},{"cell_type":"markdown","metadata":{"id":"qFoT-s1MjTSS"},"source":["# 7. 
Try training with different Embeddings"]},{"cell_type":"code","metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"nxWFzQOhjWC8","executionInfo":{"elapsed":146359,"status":"ok","timestamp":1620188783253,"user":{"displayName":"ahmed lone","photoUrl":"","userId":"02458088882398909889"},"user_tz":-300},"outputId":"0991c8be-f22e-4d35-83be-6bcbe946d57e"},"source":["# We can use nlu.print_components(action='embed_sentence') to see every possible sentence embedding we could use. Let's use BERT!\n","nlu.print_components(action='embed_sentence')"],"execution_count":null,"outputs":[{"output_type":"stream","text":["For language <en> NLU provides the following Models : \n","nlu.load('en.embed_sentence') returns Spark NLP model tfhub_use\n","nlu.load('en.embed_sentence.use') returns Spark NLP model tfhub_use\n","nlu.load('en.embed_sentence.tfhub_use') returns Spark NLP model tfhub_use\n","nlu.load('en.embed_sentence.use.lg') returns Spark NLP model tfhub_use_lg\n","nlu.load('en.embed_sentence.tfhub_use.lg') returns Spark NLP model tfhub_use_lg\n","nlu.load('en.embed_sentence.albert') returns Spark NLP model albert_base_uncased\n","nlu.load('en.embed_sentence.electra') returns Spark NLP model sent_electra_small_uncased\n","nlu.load('en.embed_sentence.electra_small_uncased') returns Spark NLP model sent_electra_small_uncased\n","nlu.load('en.embed_sentence.electra_base_uncased') returns Spark NLP model sent_electra_base_uncased\n","nlu.load('en.embed_sentence.electra_large_uncased') returns Spark NLP model sent_electra_large_uncased\n","nlu.load('en.embed_sentence.bert') returns Spark NLP model sent_bert_base_uncased\n","nlu.load('en.embed_sentence.bert_base_uncased') returns Spark NLP model sent_bert_base_uncased\n","nlu.load('en.embed_sentence.bert_base_cased') returns Spark NLP model sent_bert_base_cased\n","nlu.load('en.embed_sentence.bert_large_uncased') returns Spark NLP model sent_bert_large_uncased\n","nlu.load('en.embed_sentence.bert_large_cased') returns 
Spark NLP model sent_bert_large_cased\n","nlu.load('en.embed_sentence.biobert.pubmed_base_cased') returns Spark NLP model sent_biobert_pubmed_base_cased\n","nlu.load('en.embed_sentence.biobert.pubmed_large_cased') returns Spark NLP model sent_biobert_pubmed_large_cased\n","nlu.load('en.embed_sentence.biobert.pmc_base_cased') returns Spark NLP model sent_biobert_pmc_base_cased\n","nlu.load('en.embed_sentence.biobert.pubmed_pmc_base_cased') returns Spark NLP model sent_biobert_pubmed_pmc_base_cased\n","nlu.load('en.embed_sentence.biobert.clinical_base_cased') returns Spark NLP model sent_biobert_clinical_base_cased\n","nlu.load('en.embed_sentence.biobert.discharge_base_cased') returns Spark NLP model sent_biobert_discharge_base_cased\n","nlu.load('en.embed_sentence.covidbert.large_uncased') returns Spark NLP model sent_covidbert_large_uncased\n","nlu.load('en.embed_sentence.small_bert_L2_128') returns Spark NLP model sent_small_bert_L2_128\n","nlu.load('en.embed_sentence.small_bert_L4_128') returns Spark NLP model sent_small_bert_L4_128\n","nlu.load('en.embed_sentence.small_bert_L6_128') returns Spark NLP model sent_small_bert_L6_128\n","nlu.load('en.embed_sentence.small_bert_L8_128') returns Spark NLP model sent_small_bert_L8_128\n","nlu.load('en.embed_sentence.small_bert_L10_128') returns Spark NLP model sent_small_bert_L10_128\n","nlu.load('en.embed_sentence.small_bert_L12_128') returns Spark NLP model sent_small_bert_L12_128\n","nlu.load('en.embed_sentence.small_bert_L2_256') returns Spark NLP model sent_small_bert_L2_256\n","nlu.load('en.embed_sentence.small_bert_L4_256') returns Spark NLP model sent_small_bert_L4_256\n","nlu.load('en.embed_sentence.small_bert_L6_256') returns Spark NLP model sent_small_bert_L6_256\n","nlu.load('en.embed_sentence.small_bert_L8_256') returns Spark NLP model sent_small_bert_L8_256\n","nlu.load('en.embed_sentence.small_bert_L10_256') returns Spark NLP model 
sent_small_bert_L10_256\n","nlu.load('en.embed_sentence.small_bert_L12_256') returns Spark NLP model sent_small_bert_L12_256\n","nlu.load('en.embed_sentence.small_bert_L2_512') returns Spark NLP model sent_small_bert_L2_512\n","nlu.load('en.embed_sentence.small_bert_L4_512') returns Spark NLP model sent_small_bert_L4_512\n","nlu.load('en.embed_sentence.small_bert_L6_512') returns Spark NLP model sent_small_bert_L6_512\n","nlu.load('en.embed_sentence.small_bert_L8_512') returns Spark NLP model sent_small_bert_L8_512\n","nlu.load('en.embed_sentence.small_bert_L10_512') returns Spark NLP model sent_small_bert_L10_512\n","nlu.load('en.embed_sentence.small_bert_L12_512') returns Spark NLP model sent_small_bert_L12_512\n","nlu.load('en.embed_sentence.small_bert_L2_768') returns Spark NLP model sent_small_bert_L2_768\n","nlu.load('en.embed_sentence.small_bert_L4_768') returns Spark NLP model sent_small_bert_L4_768\n","nlu.load('en.embed_sentence.small_bert_L6_768') returns Spark NLP model sent_small_bert_L6_768\n","nlu.load('en.embed_sentence.small_bert_L8_768') returns Spark NLP model sent_small_bert_L8_768\n","nlu.load('en.embed_sentence.small_bert_L10_768') returns Spark NLP model sent_small_bert_L10_768\n","nlu.load('en.embed_sentence.small_bert_L12_768') returns Spark NLP model sent_small_bert_L12_768\n","For language <fi> NLU provides the following Models : \n","nlu.load('fi.embed_sentence') returns Spark NLP model sent_bert_finnish_cased\n","nlu.load('fi.embed_sentence.bert.cased') returns Spark NLP model sent_bert_finnish_cased\n","nlu.load('fi.embed_sentence.bert.uncased') returns Spark NLP model sent_bert_finnish_uncased\n","For language <xx> NLU provides the following Models : \n","nlu.load('xx.embed_sentence') returns Spark NLP model sent_bert_multi_cased\n","nlu.load('xx.embed_sentence.bert') returns Spark NLP model sent_bert_multi_cased\n","nlu.load('xx.embed_sentence.bert.cased') returns Spark NLP model 
sent_bert_multi_cased\n","nlu.load('xx.embed_sentence.labse') returns Spark NLP model labse\n"],"name":"stdout"}]},{"cell_type":"code","metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"IKK_Ii_gjJfF","executionInfo":{"elapsed":4703913,"status":"ok","timestamp":1620193340823,"user":{"displayName":"ahmed lone","photoUrl":"","userId":"02458088882398909889"},"user_tz":-300},"outputId":"feb3a543-d15e-4a21-d737-a6a33a18fdd3"},"source":["trainable_pipe = nlu.load('en.embed_sentence.small_bert_L12_768 train.classifier')\n","# Sentence embeddings other than USE usually need longer training and a smaller learning rate (LR)\n","# The hyperparameters could be tuned further with methods such as grid search\n","# Longer training also tends to improve accuracy\n","trainable_pipe['trainable_classifier_dl'].setMaxEpochs(90)  \n","trainable_pipe['trainable_classifier_dl'].setLr(0.0005) \n","fitted_pipe = trainable_pipe.fit(train_df)\n","# Predict with the fitted pipeline on the training set and get predictions\n","preds = fitted_pipe.predict(train_df,output_level='document')\n","\n","# The sentence detector that is part of the pipe generates some NaNs. 
Let's drop them first\n","preds.dropna(inplace=True)\n","print(classification_report(preds['y'], preds['classifier_dl']))\n","\n","#preds"],"execution_count":null,"outputs":[{"output_type":"stream","text":["sent_small_bert_L12_768 download started this may take some time.\n","Approximate size to download 392.9 MB\n","[OK!]\n","sentence_detector_dl download started this may take some time.\n","Approximate size to download 354.6 KB\n","[OK!]\n","              precision    recall  f1-score   support\n","\n","  acceptable       0.88      0.51      0.65      1025\n","        best       0.64      0.97      0.77      1009\n","        good       0.47      0.30      0.37      1008\n","   very good       0.46      0.58      0.51      1006\n","\n","    accuracy                           0.59      4048\n","   macro avg       0.61      0.59      0.57      4048\n","weighted avg       0.61      0.59      0.57      4048\n","\n"],"name":"stdout"}]},{"cell_type":"markdown","metadata":{"id":"_1jxw3GnVGlI"},"source":["# 8. Evaluate on test data"]},{"cell_type":"code","metadata":{"colab":{"background_save":true},"id":"Fxx4yNkNVGFl","outputId":"88c09d44-b5d8-43fe-85fc-238bbaf593ab"},"source":["preds = fitted_pipe.predict(test_df,output_level='document')\n","\n","# The sentence detector that is part of the pipe generates some NaNs. 
Let's drop them first\n","preds.dropna(inplace=True)\n","print(classification_report(preds['y'], preds['classifier_dl']))"],"execution_count":null,"outputs":[{"output_type":"stream","text":["              precision    recall  f1-score   support\n","\n","  acceptable       0.89      0.56      0.69       240\n","        best       0.65      0.95      0.77       256\n","        good       0.51      0.31      0.39       257\n","   very good       0.47      0.59      0.52       259\n","\n","    accuracy                           0.60      1012\n","   macro avg       0.63      0.60      0.59      1012\n","weighted avg       0.62      0.60      0.59      1012\n","\n"],"name":"stdout"}]},{"cell_type":"markdown","metadata":{"id":"2BB-NwZUoHSe"},"source":["# 9. Let's save the model"]},{"cell_type":"code","metadata":{"id":"eLex095goHwm","colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"status":"ok","timestamp":1620193806428,"user_tz":-300,"elapsed":74729,"user":{"displayName":"ahmed lone","photoUrl":"","userId":"02458088882398909889"}},"outputId":"bb711e14-35b7-4773-f0d8-7211182d7b8b"},"source":["stored_model_path = './models/classifier_dl_trained' \n","fitted_pipe.save(stored_model_path)"],"execution_count":null,"outputs":[{"output_type":"stream","text":["Stored model in ./models/classifier_dl_trained\n"],"name":"stdout"}]},{"cell_type":"markdown","metadata":{"id":"e_b2DPd4rCiU"},"source":["# 10. Let's load the model from HDD.\n","This makes offline NLU usage possible!   
\n","You need to call nlu.load(path=path_to_the_pipe) to load a model/pipeline from disk."]},{"cell_type":"code","metadata":{"id":"SO4uz45MoRgp","colab":{"base_uri":"https://localhost:8080/","height":80},"executionInfo":{"status":"ok","timestamp":1620193822974,"user_tz":-300,"elapsed":16558,"user":{"displayName":"ahmed lone","photoUrl":"","userId":"02458088882398909889"}},"outputId":"76335160-560f-4c4d-8556-f3ef98622e23"},"source":["hdd_pipe = nlu.load(path=stored_model_path)\n","\n","preds = hdd_pipe.predict('It was one of the best wines i ever tasted .')\n","preds"],"execution_count":null,"outputs":[{"output_type":"execute_result","data":{"text/html":["<div>\n","<style scoped>\n","    .dataframe tbody tr th:only-of-type {\n","        vertical-align: middle;\n","    }\n","\n","    .dataframe tbody tr th {\n","        vertical-align: top;\n","    }\n","\n","    .dataframe thead th {\n","        text-align: right;\n","    }\n","</style>\n","<table border=\"1\" class=\"dataframe\">\n","  <thead>\n","    <tr style=\"text-align: right;\">\n","      <th></th>\n","      <th>sentence_embedding_from_disk</th>\n","      <th>from_disk_confidence_confidence</th>\n","      <th>origin_index</th>\n","      <th>document</th>\n","      <th>sentence</th>\n","      <th>from_disk</th>\n","      <th>text</th>\n","    </tr>\n","  </thead>\n","  <tbody>\n","    <tr>\n","      <th>0</th>\n","      <td>[[-0.0787801593542099, 0.1528548002243042, 0.1...</td>\n","      <td>[0.9994293]</td>\n","      <td>8589934592</td>\n","      <td>It was one of the best wines i ever tasted .</td>\n","      <td>[It was one of the best wines i ever tasted .]</td>\n","      <td>[best]</td>\n","      <td>It was one of the best wines i ever tasted .</td>\n","    </tr>\n","  </tbody>\n","</table>\n","</div>"],"text/plain":["                        sentence_embedding_from_disk  ...                                          text\n","0  [[-0.0787801593542099, 0.1528548002243042, 0.1...  ...  
It was one of the best wines i ever tasted .\n","\n","[1 rows x 7 columns]"]},"metadata":{"tags":[]},"execution_count":12}]},{"cell_type":"code","metadata":{"id":"e0CVlkk9v6Qi","colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"status":"ok","timestamp":1620193822977,"user_tz":-300,"elapsed":39,"user":{"displayName":"ahmed lone","photoUrl":"","userId":"02458088882398909889"}},"outputId":"6752328e-cecd-4025-d11e-481e4b574ac3"},"source":["hdd_pipe.print_info()"],"execution_count":null,"outputs":[{"output_type":"stream","text":["The following parameters are configurable for this NLU pipeline (You can copy paste the examples) :\n",">>> pipe['document_assembler'] has settable params:\n","pipe['document_assembler'].setCleanupMode('shrink')                                                    | Info: possible values: disabled, inplace, inplace_full, shrink, shrink_full, each, each_full, delete_full | Currently set to : shrink\n",">>> pipe['sentence_detector@SentenceDetectorDLModel_c83c27f46b97'] has settable params:\n","pipe['sentence_detector@SentenceDetectorDLModel_c83c27f46b97'].setExplodeSentences(False)              | Info: whether to explode each sentence into a different row, for better parallelization. Defaults to false. 
| Currently set to : False\n","pipe['sentence_detector@SentenceDetectorDLModel_c83c27f46b97'].setStorageRef('SentenceDetectorDLModel_c83c27f46b97')  | Info: storage unique identifier | Currently set to : SentenceDetectorDLModel_c83c27f46b97\n","pipe['sentence_detector@SentenceDetectorDLModel_c83c27f46b97'].setEncoder(com.johnsnowlabs.nlp.annotators.sentence_detector_dl.SentenceDetectorDLEncoder@5ea4fd4c)  | Info: Data encoder | Currently set to : com.johnsnowlabs.nlp.annotators.sentence_detector_dl.SentenceDetectorDLEncoder@5ea4fd4c\n","pipe['sentence_detector@SentenceDetectorDLModel_c83c27f46b97'].setImpossiblePenultimates(['Bros', 'No', 'al', 'vs', 'etc', 'Fig', 'Dr', 'Prof', 'PhD', 'MD', 'Co', 'Corp', 'Inc', 'bros', 'VS', 'Vs', 'ETC', 'fig', 'dr', 'prof', 'PHD', 'phd', 'md', 'co', 'corp', 'inc', 'Jan', 'Feb', 'Mar', 'Apr', 'Jul', 'Aug', 'Sep', 'Sept', 'Oct', 'Nov', 'Dec', 'St', 'st', 'AM', 'PM', 'am', 'pm', 'e.g', 'f.e', 'i.e'])  | Info: Impossible penultimates | Currently set to : ['Bros', 'No', 'al', 'vs', 'etc', 'Fig', 'Dr', 'Prof', 'PhD', 'MD', 'Co', 'Corp', 'Inc', 'bros', 'VS', 'Vs', 'ETC', 'fig', 'dr', 'prof', 'PHD', 'phd', 'md', 'co', 'corp', 'inc', 'Jan', 'Feb', 'Mar', 'Apr', 'Jul', 'Aug', 'Sep', 'Sept', 'Oct', 'Nov', 'Dec', 'St', 'st', 'AM', 'PM', 'am', 'pm', 'e.g', 'f.e', 'i.e']\n","pipe['sentence_detector@SentenceDetectorDLModel_c83c27f46b97'].setModelArchitecture('cnn')             | Info: Model architecture (CNN) | Currently set to : cnn\n",">>> pipe['bert_sentence@sent_small_bert_L12_768'] has settable params:\n","pipe['bert_sentence@sent_small_bert_L12_768'].setBatchSize(8)                                          | Info: Size of every batch | Currently set to : 8\n","pipe['bert_sentence@sent_small_bert_L12_768'].setCaseSensitive(False)                                  | Info: whether to ignore case in tokens for embeddings matching | Currently set to : False\n","pipe['bert_sentence@sent_small_bert_L12_768'].setDimension(768)                       
                 | Info: Number of embedding dimensions | Currently set to : 768\n","pipe['bert_sentence@sent_small_bert_L12_768'].setMaxSentenceLength(128)                                | Info: Max sentence length to process | Currently set to : 128\n","pipe['bert_sentence@sent_small_bert_L12_768'].setIsLong(False)                                         | Info: Use Long type instead of Int type for inputs buffer - Some Bert models require Long instead of Int. | Currently set to : False\n","pipe['bert_sentence@sent_small_bert_L12_768'].setStorageRef('sent_small_bert_L12_768')                 | Info: unique reference name for identification | Currently set to : sent_small_bert_L12_768\n",">>> pipe['classifier_dl@sent_small_bert_L12_768'] has settable params:\n","pipe['classifier_dl@sent_small_bert_L12_768'].setClasses(['very good', 'acceptable', 'best', 'good'])  | Info: get the tags used to trained this ClassifierDLModel | Currently set to : ['very good', 'acceptable', 'best', 'good']\n","pipe['classifier_dl@sent_small_bert_L12_768'].setStorageRef('sent_small_bert_L12_768')                 | Info: unique reference name for identification | Currently set to : sent_small_bert_L12_768\n"],"name":"stdout"}]}]}